OmniPaper Smart Information Retrieval Prototype
Authors
Abstract
The OmniPaper project has implemented several information retrieval prototypes in the area of electronic news publishing. The first prototype uses SOAP as the communication protocol between the central system and a number of distributed news archives. The second prototype uses an RDF metadata database, enabling direct metadata queries to the central system. Finally, the Topic Map prototype uses query expansion and semantic linking for smart metadata search. It enhances the search experience by implementing a knowledge layer that combines the semantic content of a lexical database, consisting of concepts and keywords, with a metadata set of newspaper articles. After developing and testing these three smaller prototypes, the OmniPaper consortium combined them into one. This final prototype implements a kind of "enhanced full-text search" engine: it is an interface on top of existing search engines. When a user submits a query, the query is forwarded to several distributed news archives to retrieve relevant news articles. In addition, the system 1) translates queries to enable multilingual search, 2) provides a query refinement mechanism, in both graphical and text-based form, that allows users to adapt their query, and 3) provides a uniform result-ranking algorithm across the different news archives. The prototype treats querying and navigation as alternative methods of finding relevant information. The two interact, and together they produce a combined user experience that can be summed up as "find what you were looking for and then browse away from it". In fact, the prototype considers both querying and navigation to be a kind of search action and tries to integrate them. Concretely, the keywords in a query are looked up in a dictionary and shown to the user. In the background, the keywords are translated and expanded to related terms. These expanded queries are sent to the underlying full-text search engine(s) in all requested languages. In the graphical tool (the "web of concepts"), users can redefine the meaning of their query words, resulting in an updated query and result set. Both disambiguation (choosing one meaning of a word out of many) and refinement (browsing to related words) are possible. Figure 1 shows the web of concepts for the query "poll Indonesia". The word "Indonesia" is recognized in only one concept, "Dutch East Indies", whereas the word "poll" has many different meanings. …
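The expansion, disambiguation, and translation steps described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the OmniPaper implementation: the lexicon entries, the two meanings listed for "poll", the Dutch translation table, and the function names `expand` and `build_queries` are all assumptions made for the example. Only the query "poll Indonesia" and the concept "Dutch East Indies" come from the abstract.

```python
# Hypothetical toy lexicon: keyword -> candidate concepts -> related terms.
LEXICON = {
    "poll": {
        "election": ["vote", "ballot"],
        "survey": ["opinion poll", "questionnaire"],
    },
    "indonesia": {
        "Dutch East Indies": ["indonesia"],
    },
}

# Hypothetical translation table: (English term, language code) -> translation.
TRANSLATIONS = {
    ("poll", "nl"): "peiling",
    ("vote", "nl"): "stem",
    ("ballot", "nl"): "stembiljet",
    ("indonesia", "nl"): "indonesië",
}


def expand(keyword, chosen_concept=None):
    """Return the keyword followed by its related terms.

    If the user picked one concept (disambiguation), only that
    concept's related terms are used; otherwise all concepts contribute.
    """
    key = keyword.lower()
    concepts = LEXICON.get(key, {})
    if chosen_concept is not None:
        concepts = {chosen_concept: concepts.get(chosen_concept, [])}
    terms = [key]
    for related in concepts.values():
        for term in related:
            if term not in terms:
                terms.append(term)
    return terms


def build_queries(query, languages, choices=None):
    """Build one expanded term list per requested language."""
    choices = choices or {}
    queries = {}
    for lang in languages:
        terms = []
        for keyword in query.split():
            for term in expand(keyword, choices.get(keyword.lower())):
                # Fall back to the untranslated term when no translation is known.
                terms.append(TRANSLATIONS.get((term, lang), term))
        queries[lang] = terms
    return queries


if __name__ == "__main__":
    # The user has disambiguated "poll" to the (hypothetical) "election" concept.
    for lang, terms in build_queries(
        "poll Indonesia", ["en", "nl"], {"poll": "election"}
    ).items():
        print(lang, terms)
```

In a setup like the one the abstract describes, each per-language term list would then be forwarded to the underlying full-text search engines; that fan-out and the uniform ranking step are left out of this sketch.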
Similar resources
The Omnipaper Metadata RDF/XML Prototype Implementation
Omnipaper (Smart Access to European Newspapers, IST-2001-32174) is a project of the European Commission IST (Information Society Technologies) program that investigates and proposes ways of accessing different types of distributed information sources. This article intends to introduce the Resource Description Framework (RDF) technology, developed by the W3C for the Web and based on metadata, and its p...
Smart Search in Newspaper Archives Using Topic Maps
The OmniPaper project has implemented three information retrieval prototypes in the area of electronic news publishing. One prototype uses SOAP as communication protocol between the central system and a number of distributed news archives. The second prototype uses an RDF metadata database, enabling direct metadata queries to the central system. Finally the Topic Map prototype uses query expans...
The instantiation of OmniPaper RDF prototype in the context of scientific publications
Purpose of this paper: The purpose of this paper is to present an instance of the system developed in the OmniPaper project, regarding the mechanisms of distributed information retrieval. These mechanisms were developed for newspaper articles and were then instantiated in the context of scientific publications. Another goal concerns the use of a central metadatabase developed to accomp...
The Extension Of The Omnipaper System In The Context Of Scientific Publications
Today the Internet is an important information source, which facilitates the search for and access to information content on the Web. In fact, the Internet has become an important tool used daily by scholars in their work. However, the content published on the Web increases daily, which makes it difficult to identify new content published in various information source...
Incorporating a Semantically Enriched Navigation Layer onto an Rdf Metadatabase
The Omnipaper project, funded by Information Society Technologies (IST), proposes to investigate efficient ways to enable access to distributed and heterogeneous digital news archives through the use of state-of-the-art technologies such as RDF and XTM. In the Omnipaper project we intend to implement a final prototype that enables users (professional journalists and occasional ...
Publication date: 2005